The M.2 Max is an AI inference acceleration card powered by the Metis AIPU, designed to enable Large Language Models (LLMs) and Vision Language Models (VLMs) on power-constrained edge and embedded devices. It offers high memory performance in a small footprint and supports complex computer vision tasks using parallel or cascaded models.
Key features include:
- Memory capacities up to 16 GB with various cooling options.
- Support for standard and extended operating temperature ranges.
- Hardware Root-of-Trust for secure boot and firmware integrity.
- Integration via the Voyager SDK and advanced quantization tools.
- Compatibility with a PCIe Gen 3.0 x4 host interface and Intel, AMD, and Arm64 processors, on both Linux and Windows.
PrismML, a venture originating from Caltech, has introduced its new 1-bit large language model, Bonsai 8B, designed to significantly enhance AI efficiency on edge hardware. This innovative model architecture represents weights using only their sign and a shared scale factor, resulting in a memory footprint of just 1.15 GB. Compared to full-precision models, Bonsai 8B is 14 times smaller, 8 times faster, and 5 times more energy-efficient, while maintaining competitive performance. By drastically reducing memory and power requirements, PrismML aims to enable advanced AI applications on mobile devices, real-time robotics, and secure enterprise systems, effectively moving powerful language models out of massive cloud datacenters and onto local hardware.
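The sign-plus-shared-scale representation described above can be illustrated with a minimal sketch. This is a generic 1-bit quantization toy in NumPy, assuming a per-tensor scale equal to the mean absolute weight; it is not PrismML's actual implementation, and the function names are hypothetical.

```python
import numpy as np

def binarize(w):
    """Toy 1-bit quantization: each weight keeps only its sign,
    plus one shared scale factor for the whole tensor.
    (Illustrative sketch, not PrismML's code.)"""
    scale = np.abs(w).mean()   # single shared scale per tensor
    signs = np.sign(w)         # +1.0 / -1.0 per weight (1 bit each)
    return signs, scale

def dequantize(signs, scale):
    """Reconstruct an approximate weight tensor from signs and scale."""
    return signs * scale

w = np.array([0.4, -0.2, 0.1, -0.7])
signs, scale = binarize(w)   # signs: [1, -1, 1, -1], scale: 0.35
```

Stored this way, each weight costs one bit instead of 32, which is how an 8B-parameter model can fit in roughly 1 GB plus the overhead of scales and embeddings.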
Google has released the Gemma 4 model family, with NVIDIA providing optimized support across a wide range of hardware, from data centers to edge devices like Jetson. This new generation includes the first Gemma MoE model and supports over 140 languages, enabling advanced capabilities like reasoning, code generation, and multimodal input.
Developers can fine-tune and deploy Gemma 4 using tools like NeMo Automodel and NVIDIA NIM, with commercial licensing available. The models are optimized for local deployment with frameworks such as vLLM, Ollama, and llama.cpp, offering flexibility for various use cases, including robotics, smart machines, and secure on-premise applications.
This paper proposes SpaceCoMP, a MapReduce-inspired processing model for LEO satellite mesh networks, addressing the challenge of downlink bandwidth limitations by processing data in orbit. It leverages orbital dynamics and proposes optimizations for routing and task scheduling to improve data processing efficiency.
This article details how to set up and use Machinechat JEDI with the Seeed Studio reTerminal DM for industrial IoT applications, including hardware/software preparation, installation, data pipeline creation, visualization, and MQTT integration.
Orange Pi has announced the Orange Pi AI Station, a compact edge computing platform featuring the Ascend 310 processor, offering up to 176 TOPS of AI compute performance with options for up to 96GB of LPDDR4X memory and NVMe storage.
This article details how to build a fast, offline AI chatbot using a Raspberry Pi 5, RLM AA50 accelerator card, and optimization techniques for speech recognition, natural language processing, and text-to-speech tasks.
Researchers report a unified memory stack that functions both as a memristor and as a ferroelectric capacitor, enabling energy-efficient inference as well as on-device learning at the edge.
This paper proposes SkyMemory, a LEO satellite constellation hosted key-value cache (KVC) to accelerate transformer-based inference, particularly for large language models (LLMs). It explores different chunk-to-server mapping strategies (rotation-aware, hop-aware, and combined) and presents simulation results and a proof-of-concept implementation demonstrating performance improvements.
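The hop-aware mapping idea can be sketched in a few lines. The following toy assigns each KV-cache chunk to a satellite server using a simple cost of hop distance plus current load; the inputs (a hop-count table, a server list) are assumed for illustration, and the paper's actual rotation-aware and combined strategies are considerably more involved.

```python
def map_chunks_hop_aware(chunks, servers, hops):
    """Toy hop-aware chunk-to-server mapping: place each KV-cache chunk
    on the server with the lowest cost, where cost = hop distance from
    the inference node + number of chunks already assigned (crude load
    balancing). Illustrative sketch only, not the SkyMemory algorithm."""
    assignment = {}
    load = {s: 0 for s in servers}
    for c in chunks:
        best = min(servers, key=lambda s: hops[s] + load[s])
        assignment[c] = best
        load[best] += 1
    return assignment

# Two satellites: 'a' is 1 hop away, 'b' is 2 hops away.
mapping = map_chunks_hop_aware([0, 1, 2], ["a", "b"], {"a": 1, "b": 2})
```

As load on the nearer satellite grows, later chunks spill over to the farther one, trading extra hops for balanced placement.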
The article introduces the concept of Federated Language Models, combining edge-based Small Language Models (SLMs) with cloud-based Large Language Models (LLMs) for enhanced privacy and performance in AI applications.
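The edge/cloud split behind Federated Language Models can be sketched as a confidence-based router: answer on-device with the SLM when it is confident, and escalate to the cloud LLM otherwise. The callables and confidence field below are hypothetical placeholders, not an API from the article.

```python
def route(query, slm, llm, threshold=0.7):
    """Hypothetical federated router: the edge SLM answers when its
    confidence clears the threshold, keeping data on-device; otherwise
    the query escalates to the cloud LLM. `slm` and `llm` are
    placeholder callables for illustration."""
    local = slm(query)
    if local["confidence"] >= threshold:
        return ("edge", local["text"])
    return ("cloud", llm(query))

# Toy stand-ins to demonstrate the control flow:
def tiny_slm(q):
    known = {"battery status?": ("78%", 0.95)}
    text, conf = known.get(q, ("", 0.1))
    return {"text": text, "confidence": conf}

def cloud_llm(q):
    return f"[cloud answer to: {q}]"
```

The privacy benefit comes from the control flow: only queries the SLM cannot handle ever leave the device.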